ABSTRACT

We study the detectability of the cross-correlation between 21 cm emission from the intergalactic 
medium and the galaxy distribution during (and before) reionization. We show that first-generation 
21 cm experiments, such as the Mileura Widefield Array (MWA), can measure the cross-correlation 
to a precision of several percent on scales k ~ 0.1 Mpc^-1 if combined with a deep galaxy survey
detecting all galaxies with m > 10^10 M_⊙ over the entire ~ 800 square degree field of view of the
MWA. LOFAR can attain even better limits with galaxy surveys covering its ~ 50 square degree field
of view. The errors on the cross-power spectrum scale inversely with the square root of the overlap volume, so
even reasonably modest surveys of several square degrees should yield a positive detection with either 
instrument. In addition to the obvious scientific value, the cross-correlation has four key advantages 
over the 21 cm signal alone: (1) its signal-to-noise exceeds that of the 21 cm power spectrum by 
a factor of several, allowing it to probe smaller spatial scales and perhaps to detect inhomogeneous 
reionization more efficiently; (2) it allows a cleaner division of the redshift-space distortions (although 
only if the galaxy redshifts are known precisely); (3) by correlating with the high-redshift galaxy 
population, the cosmological nature of the 21 cm fluctuations can be determined unambiguously; and 
(4) the required level of foreground cleaning for the 21 cm signal is vastly reduced. 

Subject headings: cosmology: theory - intergalactic medium - galaxies: high-redshift 



1. INTRODUCTION 

The epoch of reionization is one of the landmark events 
in structure formation, because it defines the moment at 
which the small fraction of material bound inside galaxies 
affected each and every baryon in the Universe. As such 
reionization is an excellent tracer of the early generations 
of structure formation, offering a window into the prop- 
erties of the first stars, the formation of the first quasars, 
and the growth of galactic systems. It is also a crucial 
event because of its effects on the galaxies themselves, 
suppressing the formation of small systems and affecting 
the formation of galaxies like our own. All of this has 
made the epoch of reionization one of the frontiers of 
modern astrophysics. 

Observations are now beginning to probe reionization, 
but it remains mysterious. High-redshift quasars selected 
from the Sloan Digital Sky Survey may indicate a rapidly
increasing neutral fraction at z ~ 6 (Fan et al. 2006),
although that conclusion is model-dependent (Songaila
2004; Lidz et al. 2006a; Becker et al. 2006). Specific
features in some of the spectra may also point toward a
neutral fraction x_HI ≳ 0.1 along at least some lines of
sight (Mesinger & Haiman 2004; Wyithe & Loeb 2004;
Wyithe et al. 2005; Mesinger & Haiman 2006). On the
other hand, the abundance of Lyα-emitting galaxies at
z = 6.56 argues strongly against a predominantly neutral
Universe at that time (Malhotra & Rhoads 2004;
Furlanetto et al. 2006c; Malhotra & Rhoads 2006), and
the cosmic microwave background (CMB) polarization
observed by the Wilkinson Microwave Anisotropy Probe
(WMAP) (Page et al. 2006) points toward reionization
beginning as early as z > 8-10.

Disentangling this knot will require new techniques to 
observe the high-redshift Universe. Two of the most 
promising are galaxy surveys and the 21 cm line. The 
merits of the former are obvious, because such obser- 
vations reveal the detailed properties of the ionizing 
sources. The 21 cm line is perhaps the most exciting 
possibility for directly studying the reionization of the
intergalactic medium (IGM); see Furlanetto et al. (2006b)
for a recent review. While it remains neutral, the IGM 
will be a net emitter or absorber of redshifted 21 cm 
photons from the CMB provided that its spin tempera- 
ture differs from the CMB temperature, a condition that 
should be satisfied well before reionization is complete
(Sethi 2005; Furlanetto 2006). In that case, fluctuations
in the 21 cm brightness trace fluctuations in the density,
ionized fraction, and spin temperature of the IGM,
allowing (in an ideal world) a tomographic reconstruction
of the history of structure formation (Madau et al.
1997). Unfortunately, the experimental challenges are
formidable indeed, especially given the extremely bright
Galactic and extragalactic foregrounds at the low
frequencies (ν < 200 MHz) relevant for these observations.
Cleaning these foregrounds will require great care in
calibration and sophisticated data analysis algorithms (see
the discussion in §9 of Furlanetto et al. 2006b).

One interesting question is how these two datasets can 
be combined to reveal even more information about the 
high-redshift Universe. The potential synergy is obvious, 
given the complementarity between studying the ionizing
sources and the (to-be-ionized) IGM. Wyithe & Loeb
(2006) made a first step in this direction by showing 
that first-generation 21 cm surveys, together with exist- 
ing galaxy surveys, may be able to distinguish "inside- 
out" and "outside-in" reionization scenarios (in which, 
respectively, over- and underdense gas is ionized first). 
But of course even more is possible. For example, the 
ionizing efficiency could vary with galaxy mass (or any
other parameter), so that some subset of the galaxy
population is responsible for most of reionization. The
structure of the HII regions will depend upon such
factors (Furlanetto et al. 2006a; McQuinn et al. 2006), and
we can most efficiently learn about them by comparing
the galaxies to the 21 cm pattern. Furthermore,
the small-scale fluctuations in the 21 cm signal depend
on the interactions between the galaxies and the cosmic
web surrounding them (e.g., Zahn et al. 2006; Lidz et al.
2006b). Cross-correlation also has some useful properties
on the data analysis side: because only a small fraction
of the 21 cm foreground originates from high redshifts,
cross-correlation with the galaxies (known to be
at high redshift) greatly eases the foreground removal
requirements and offers unambiguous confirmation of the
cosmological signal.

In this paper, we will study the detectability of this 
cross-correlation and quantify how well it can be mea- 
sured with a variety of 21 cm and galaxy surveys. Note 
that we will not explore the astrophysics content of the 
cross-correlation: we will take the simplest possible signal
and consider how well one can measure it in realistic
experiments. We defer a detailed examination of the
science return to future work (also see Wyithe & Loeb
2006). In §2, we describe our simple model for the signal
and how we calculate the errors in the cross-correlation.
We present our results for the sensitivity in §3, including
the spherically-averaged power, the effect of redshift
errors in the galaxy survey, and the sensitivity to
redshift-space distortions. We then consider the effect of 21 cm
foregrounds in §4, and finally we conclude in §5.

In our numerical calculations, we assume a cosmology
with Ω_m = 0.26, Ω_Λ = 0.74, Ω_b = 0.044, H_0 =
100h km s^-1 Mpc^-1 (with h = 0.74), n = 0.95, and
σ_8 = 0.8, consistent with the most recent measurements
(Spergel et al. 2006).^3 Unless otherwise specified, we use
comoving units for all distances.

^3 Note that we have increased σ_8 above the best fit WMAP value
in order to improve agreement with weak lensing measurements.

2. METHOD

2.1. The Signal

The 21 cm power spectrum and the galaxy power spectrum
are themselves complicated beasts, and we expect
their cross-correlation to be even more complex because
galaxies source most of the fluctuations in the brightness
temperature (especially HII regions, whose strong
contrast and complex shapes dominate the fluctuations
throughout reionization). Of course, this makes it a
magnificent probe of the interactions between galaxies and
the IGM; unfortunately, it also makes the signal difficult
to model robustly and transparently. Because we are not
concerned here with the detailed astrophysics underlying
the cross-correlation but rather with its overall
detectability, we will take the simplest possible model for it. We
assume that both the 21 cm brightness and the galaxy
distribution trace the linear density power spectrum P_δδ.
We thus neglect perturbations to the 21 cm signal from
ionized regions (as well as inhomogeneous heating and
spin temperature coupling). This is obviously a drastic
oversimplification, but it allows us to examine the
prospects for detection in a straightforward manner. We
also neglect nonlinear corrections. For the density field,
these become important at k > 5 Mpc^-1, near the upper
range of the scales we will consider. However, galaxies
are so highly biased that nonlinear effects set in even
on large scales and may affect the detailed shape of the
power spectrum through much of the range we consider.
With these simplifications, the cross-power spectrum
between the galaxy field and the 21 cm brightness
temperature can be written

$$P_{21,g}(k,\mu,z) = (1 + \beta\mu^2)\,(1 + \beta\mu^2/b)\; b\; \delta T_b\; P_{\delta\delta}(k,z), \tag{1}$$

where k is the wavenumber, μ is the cosine of the angle
between k and the line of sight, k = |k|, β ≈ Ω_m(z)^0.6,
b is the mean galaxy bias within the sample, δT_b ≈
20 x_HI mK is the mean 21 cm brightness temperature of
the IGM (see Furlanetto et al. 2006b for details), and x_HI
is the globally-averaged neutral fraction (which we
assume to be unity throughout). β accounts for the growth
of velocity perturbations relative to those of density; the
two factors describe these redshift-space distortions in
the 21 cm and galaxy signals, respectively (Kaiser 1987;
Bharadwaj & Ali 2004; Barkana & Loeb 2005a). Note
that we will quote results in terms of the power per
logarithmic interval, Δ^2 = k^3 P(k)/(2π^2).
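
As an illustration of equation (1), the following is a minimal sketch rather than the authors' code. The function names are hypothetical, the linear density power spectrum is assumed to be supplied by the caller, and the flat-universe expression for Ω_m(z) is an assumption built from the cosmological parameters quoted in the Introduction.

```python
def omega_m_z(z, omega_m0=0.26, omega_l0=0.74):
    """Matter density parameter at redshift z (flat LCDM assumed)."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (matter + omega_l0)


def cross_power(p_dd, mu, z, bias, x_hi=1.0):
    """Simple-model cross-power of equation (1).

    p_dd : linear density power spectrum at (k, z), supplied by the caller
    mu   : cosine of the angle between the wavevector and the line of sight
    bias : mean galaxy bias b of the sample
    x_hi : globally-averaged neutral fraction (unity in the text)
    """
    beta = omega_m_z(z) ** 0.6            # beta ~ Omega_m(z)^0.6
    dtb_mk = 20.0 * x_hi                  # mean brightness temperature in mK
    kaiser_21cm = 1.0 + beta * mu ** 2    # redshift-space factor, 21 cm field
    kaiser_gal = 1.0 + beta * mu ** 2 / bias  # redshift-space factor, galaxies
    return kaiser_21cm * kaiser_gal * bias * dtb_mk * p_dd
```

For transverse modes (μ = 0) both redshift-space factors reduce to unity, so the result is simply b δT_b P_δδ; along the line of sight the 21 cm factor approaches 2 at high redshift, where Ω_m(z) is close to one.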

Again, we emphasize that equation (1) is a naive
simplification. During reionization, galaxies seed HII
regions, which introduce substantial - and often dominant
- fluctuations into the 21 cm signal (Furlanetto et al.
2004b). Reionization will modify our signal in two
significant ways. First, of course, the galaxy positions and
the 21 cm brightness will actually be anti-correlated
because galaxies must sit inside ionized bubbles. Second,
the HII regions grow to rather large sizes (easily >
10 Mpc), amplifying the signal on relatively large scales
(see Furlanetto et al. 2004b; Zahn et al. 2006; Iliev et al.
2006). We will examine these physical effects in future
work; for now, our forecasts can be viewed as the ability 
to rule out the "null hypothesis" that galaxies and the 21 
cm signal both simply trace the underlying density field, 
or in other words to detect the ionized zones around the 
sampled galaxy population. Our fractional errors are ac- 
tually conservative so long as the HII regions do indeed 
amplify the large-scale signal. 

2.2. Error Estimates 

Neglecting systematic effects such as foreground
subtraction, the errors on a measurement of the 21 cm power
(with true value P_21) at a particular mode (k, μ) are
(McQuinn et al. 2005)

$$\delta P_{21}(k,\mu) = P_{21}(k,\mu) + \frac{T_{\rm sys}^2}{B\,t_{\rm int}}\,\frac{D^2\,\Delta D}{n(k_\perp)}\left(\frac{\lambda^2}{A_e}\right)^2. \tag{2}$$

Here T_sys is the system temperature of the telescope, B
is the total bandwidth of the measurement, t_int is the
total integration time, λ is the wavelength of the
observation, and A_e is the effective area of each telescope; the
last factor is of order unity for antennae optimized to
observe at the appropriate redshift. The distance to the
survey volume is D and its radial width is ΔD. The
array geometry enters through the factor n(k_⊥), which
is the density of baselines observing at the appropriate
transverse wavevector, k_⊥ = (1 - μ^2)^{1/2} k, normalized so
that its integral over the half-plane is the total number
of baselines, N_a(N_a - 1)/2, where N_a is the number of
independent elements in the array. This spans a range
defined by the minimum and maximum baselines in the
array; for example, k_⊥,max = 2π L_max/(Dλ), where L_max
is the maximum baseline distance in the array. In
equation (2), the first term represents cosmic variance and
the second thermal noise.
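
The per-mode 21 cm error described above (cosmic variance plus thermal noise) can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the unit conventions in the comments are assumptions, and the example values L_max = 1.5 km, D ≈ 8900 Mpc and λ ≈ 1.9 m are illustrative numbers for a z = 8 survey, not values quoted at this point in the text.

```python
import math


def k_perp(k, mu):
    """Transverse wavenumber k_perp = (1 - mu^2)^(1/2) k."""
    return k * math.sqrt(1.0 - mu ** 2)


def k_perp_max(l_max_m, dist_mpc, wavelength_m):
    """Largest transverse wavenumber, 2 pi L_max / (D lambda), in Mpc^-1.

    L_max and lambda are in metres, so their ratio is dimensionless.
    """
    return 2.0 * math.pi * l_max_m / (dist_mpc * wavelength_m)


def delta_p21(p21, t_sys, bandwidth_hz, t_int_s, dist, delta_dist,
              n_kperp, wavelength, a_eff):
    """Per-mode error: cosmic variance plus thermal noise.

    Assumed conventions: t_sys in the temperature unit of p21, D and
    Delta D in Mpc, wavelength^2 and a_eff in the same area unit, and
    n_kperp the (dimensionless) baseline density at this k_perp.
    """
    thermal = (t_sys ** 2 / (bandwidth_hz * t_int_s)
               * dist ** 2 * delta_dist / n_kperp
               * (wavelength ** 2 / a_eff) ** 2)
    return p21 + thermal


# Illustrative (assumed) numbers for a compact array observing z = 8.
k_max_example = k_perp_max(l_max_m=1500.0, dist_mpc=8900.0, wavelength_m=1.9)
```

Because the thermal term scales as 1/n(k_⊥), modes sampled by few baselines are noise dominated even when the cosmic-variance term is small.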

The analogous errors on a galaxy survey (with
true power spectrum P_gal) are (Feldman et al. 1994;
Tegmark et al. 1997)

$$\delta P_{\rm gal}(k,\mu) = P_{\rm gal}(k,\mu) + n_{\rm gal}^{-1}\, e^{k_\parallel^2 \sigma_r^2}, \tag{3}$$

where n_gal is the mean number density of galaxies in the
survey, k_∥ = μk is the component of k along the line of
sight, σ_r = c σ_z/H(z), and σ_z is the typical error in each
redshift measurement. Here the first term is again cosmic
variance, and the second term is a combination of shot
noise (1/n_gal) and redshift errors, which smear the
observed radial fluctuations but leave the transverse power
unaffected (see, e.g., Seo & Eisenstein 2003). Given the
difficulty of high-redshift galaxy surveys, we allow for the
possibility of large photometric redshift errors below.
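
The galaxy-survey error, with shot noise inflated by redshift smearing along the line of sight, can be written as a short function. This is a sketch under the assumption that the caller supplies the comoving smearing length σ_r = c σ_z/H(z) directly; the function name is hypothetical.

```python
import math


def delta_p_gal(p_gal, k, mu, n_gal, sigma_r=0.0):
    """Per-mode galaxy power spectrum error.

    p_gal   : true galaxy power spectrum at (k, mu), in Mpc^3
    n_gal   : mean comoving number density of galaxies, in Mpc^-3
    sigma_r : comoving redshift-smearing length in Mpc (0 for perfect redshifts)
    """
    k_par = mu * k                                  # line-of-sight component
    shot = math.exp((k_par * sigma_r) ** 2) / n_gal  # smeared shot noise
    return p_gal + shot
```

With sigma_r = 0 the second term is pure shot noise, 1/n_gal. Transverse modes (μ = 0) are unaffected by redshift errors whatever the value of sigma_r, which is the behaviour described in the paragraph above.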

The effective number density depends on the
characteristics of the survey and on the (unknown) galaxy
population at high redshifts. Rather than attempt to model
these in detail, we will take a simplistic approach and
assume that all dark matter halos with m > m_min are
detectable by the survey. We use the Press & Schechter
(1974) mass function; this probably underestimates the
number of large halos and so provides a conservative
estimate for the errors (Jang-Condell & Hernquist 2001;
Reed et al. 2003; Iliev et al. 2006; Zahn et al. 2006). Our
results can easily be rescaled to other mass functions by
choosing m_min to match our number density; we have
n_gal = 2.4 × 10^-3, 7.3 × 10^-6, and 7.2 × 10^-7 Mpc^-3
for m_min = 10^10, 10^11, and 2 × 10^11 M_⊙ at z = 8. The
mean bias in equation (1) is then the mean bias of all the
galaxies above this threshold, computed using the
standard Mo & White (1996) formula. We find b = 7.5, 11.8,
and 13.7 for our three surveys.

The error in the cross-correlation for a particular mode 
is then 

$$2\,[\delta P_{21,g}(k,\mu)]^2 = P_{21,g}^2(k,\mu) + \delta P_{21}(k,\mu)\,\delta P_{\rm gal}(k,\mu). \tag{4}$$

The factor of two comes from only sampling the upper 
half-plane, because the power spectrum is the Fourier 
transform of a real- valued function. In most of the range 
we consider, the second term dominates by a large factor. 
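
A minimal sketch of how the per-mode cross-correlation error follows from the two auto-spectrum errors, solving the relation above for δP_21,g. The function name is hypothetical.

```python
import math


def delta_p_cross(p_cross, dp_21, dp_gal):
    """Per-mode error on the cross-power spectrum.

    p_cross : true cross-power P_21,g at (k, mu)
    dp_21   : per-mode 21 cm error (cosmic variance plus thermal noise)
    dp_gal  : per-mode galaxy error (cosmic variance plus shot noise)
    """
    # The factor of 1/2 reflects sampling only the upper half-plane.
    return math.sqrt(0.5 * (p_cross ** 2 + dp_21 * dp_gal))
```

When the product of the two auto-spectrum errors dominates, the result is proportional to the geometric mean of dp_21 and dp_gal, so improving either survey helps only as the square root.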

To this point, we have considered each mode
individually; of course, in practice, we will bin them in both k
and μ. The number of modes available in an annulus of
width (Δk, Δμ) is

$$N_m(k,\mu) = 2\pi k^2\,\Delta k\,\Delta\mu \left[\frac{V_{\rm surv}}{(2\pi)^3}\right], \tag{5}$$

where the last factor is the Fourier space resolution and
the survey volume is V_surv = D^2 ΔD (λ^2/A_e).^4 We can
then compute the net error within each bin by simply
adding the errors in inverse quadrature. For example,
the errors on the spherically-averaged power spectrum
are

$$\frac{1}{\delta P_{21,g}^2(k)} = \sum_\mu \frac{\epsilon\,k^3\,\Delta\mu\,V_{\rm surv}}{4\pi^2}\,\frac{1}{\delta P_{21,g}^2(k,\mu)}, \tag{6}$$

where ε = Δk/k.

^4 Here we implicitly assume that the minimum baseline is equal
to the radius of one antenna - i.e., a filled core of antennae. Then
the maximum spatial scale to which the interferometer is sensitive
is ∝ D/√A_e.

The sum over μ extends over all detectable modes that
fit inside the survey volume. Because we allow modes
in only the half-plane, we must have 0 ≤ μ ≤ 1. A
typical 21 cm survey is (in the flat-sky approximation)
a thin rectangular slice of depth ΔD ≪ D: this is
much different from a typical galaxy survey, in which
the depth is large. For this reason, not all modes are
available for extraction. In particular, k_∥,min = 2π/ΔD,
so μ_max = min(1, k/k_∥,min). In the transverse direction,
galaxy surveys are in principle sensitive to infinitely large
k_⊥, but 21 cm surveys are limited by the maximum
angular resolution of the telescope - determined by L_max.
Thus μ_min^2 = max(0, 1 - k_⊥,max^2/k^2).
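
The mode counting and inverse-quadrature binning described in this subsection can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the μ bins are supplied by the caller as a list of edges, the per-mode error is passed in as a callable, and the expression for μ_max is coded exactly as it is printed in the text.

```python
import math


def n_modes(k, delta_k, delta_mu, v_surv):
    """Number of Fourier modes in an annulus of width (delta_k, delta_mu)."""
    return 2.0 * math.pi * k ** 2 * delta_k * delta_mu * v_surv / (2.0 * math.pi) ** 3


def mu_limits(k, k_par_min, k_perp_max):
    """Range of mu at wavenumber k, following the limits quoted in the text."""
    mu_min = math.sqrt(max(0.0, 1.0 - (k_perp_max / k) ** 2))
    mu_max = min(1.0, k / k_par_min)
    return mu_min, mu_max


def binned_error(k, eps, v_surv, mu_edges, per_mode_error):
    """Spherically-averaged error from adding mu bins in inverse quadrature.

    eps            : logarithmic bin width, delta_k / k
    mu_edges       : increasing list of mu bin edges
    per_mode_error : callable returning the per-mode error at a given mu
    """
    inverse_variance = 0.0
    for lo, hi in zip(mu_edges[:-1], mu_edges[1:]):
        mu_mid = 0.5 * (lo + hi)
        weight = n_modes(k, eps * k, hi - lo, v_surv)
        inverse_variance += weight / per_mode_error(mu_mid) ** 2
    return 1.0 / math.sqrt(inverse_variance)
```

If the per-mode error is the same for every μ, the binned error is simply that value divided by the square root of the total number of modes, which is a useful sanity check on any implementation.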



3. RESULTS 



We are now in a position to estimate errors on 
the cross-power spectrum of the 21 cm sky and the 
galaxy field. We will use a fiducial 21 cm survey at 
z = 8 with similar parameters to the Mileura Widefield
Array (MWA) Low Frequency Demonstrator (as in
McQuinn et al. 2005; Bowman et al. 2005). We take a
total area A_tot = 7000 m^2 spread over N_a = 500
antennae, distributed in a circle of radius 0.75 km. We
assume that each element is 4 m wide, and that they
are closely packed within a filled core and distributed
like r^-2 outside of that core. We take t_int = 1000 hr,
B = 12 MHz (note that this may be large enough for
evolutionary effects to become important across the band,
though we ignore that possibility here; McQuinn et al.
2005; Barkana & Loeb 2005c), and T_sys = 440 K. We
divide the measurement into bins of logarithmic width
ε = 0.5. We will also show some estimates for a
fictional array with a square kilometer of collecting area;
although we will refer to it as the Square Kilometer
Array (SKA), our choices do not actually correspond to any
specific design proposal. We take A_tot = 1 km^2 spread
over N_a = 5000 antennae, with a maximum baseline of
2.5 km. The antennae are distributed in a similar pattern
to those of the MWA.
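
To give a feel for the survey geometry these parameters imply, the sketch below evaluates the distance, depth, field of view and volume of the fiducial z = 8 survey. It rests on several assumptions that are ours rather than the paper's: the effective area per antenna is taken as A_tot/N_a, radiation is neglected in H(z), the depth ΔD is linearized in the bandwidth, and the comoving distance is a simple midpoint-rule integral. All function names are hypothetical.

```python
import math

C_KM_S = 299792.458            # speed of light, km/s
H_LITTLE = 0.74                # h as quoted in the Introduction
OMEGA_M, OMEGA_L = 0.26, 0.74  # flat LCDM parameters from the Introduction
NU_21_MHZ = 1420.40575         # rest frequency of the 21 cm line
LAMBDA_21_M = 0.2110611        # rest wavelength of the 21 cm line, metres


def hubble(z):
    """H(z) in km/s/Mpc, neglecting radiation."""
    return 100.0 * H_LITTLE * math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_distance(z, steps=20000):
    """Comoving distance in Mpc via a midpoint-rule integral of c/H(z)."""
    dz = z / steps
    return C_KM_S * dz * sum(1.0 / hubble((i + 0.5) * dz) for i in range(steps))


def survey_geometry(z, bandwidth_mhz, a_eff_m2):
    """Geometry of a 21 cm survey slice centred on redshift z."""
    wavelength = LAMBDA_21_M * (1.0 + z)
    nu_obs = NU_21_MHZ / (1.0 + z)
    dist = comoving_distance(z)
    delta_z = (1.0 + z) * bandwidth_mhz / nu_obs   # linearized in bandwidth
    delta_dist = C_KM_S * delta_z / hubble(z)
    fov_sr = wavelength ** 2 / a_eff_m2            # lambda^2 / A_e
    return {
        "D_mpc": dist,
        "delta_D_mpc": delta_dist,
        "fov_deg2": fov_sr * (180.0 / math.pi) ** 2,
        "V_surv_mpc3": dist ** 2 * delta_dist * fov_sr,
        "k_par_min": 2.0 * math.pi / delta_dist,
    }


# Fiducial MWA-like survey: A_e assumed equal to A_tot / N_a.
mwa = survey_geometry(z=8.0, bandwidth_mhz=12.0, a_eff_m2=7000.0 / 500.0)
```

Under these assumptions the field of view comes out within about ten percent of the ~ 800 square degrees quoted in the text, and the survey volume is a few times 10^9 Mpc^3.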

We will also assume that the accompanying galaxy sur- 
vey covers the entire volume of the 21 cm survey. This 
may be difficult in practice, because the field of view of 
the MWA is ~ 800 square degrees at this wavelength. 
Fortunately, using equation (5), it is relatively easy to
transform our results to a more realistic case where the
galaxy survey subtends only a fraction of the 21 cm
survey. By decreasing the effective survey volume, this
increases the error according to δP_21,g ∝ V_surv^{-1/2}. Thus
confining the galaxy survey to a small but contiguous field
simply increases the errors by the corresponding factor, 
and of course it also prevents the cross-correlation from 
measuring any modes wider than the galaxy field. How- 
ever, the latter effect is not nearly as important as one 
might expect, because (as we will see below) modes wider 
than the redshift depth are lost anyway. 
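
The volume scaling just described is easy to apply to any quoted full-field error. The sketch below assumes the overlap volume is proportional to the sky area of the galaxy survey (same redshift depth); the function name and the example numbers are illustrative.

```python
import math


def rescale_error(full_field_error, full_area_deg2, survey_area_deg2):
    """Scale a full-field error to a smaller contiguous galaxy survey.

    Errors grow as (overlap volume)^(-1/2); the volume is assumed to be
    proportional to the sky area covered by the galaxy survey.
    """
    if survey_area_deg2 > full_area_deg2:
        raise ValueError("galaxy survey cannot exceed the 21 cm field")
    return full_field_error * math.sqrt(full_area_deg2 / survey_area_deg2)


# A one percent full-field error, restricted to one square degree of 800.
one_square_degree = rescale_error(0.01, 800.0, 1.0)
```

Shrinking the overlap from 800 to 1 square degree raises the error by a factor of about 28, so a percent-level full-field measurement becomes a measurement with a fractional error of roughly 0.3.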
Fig. 1. — Expected errors on the cross-power spectrum. The solid
black curve is Δ^2_21,g for our simple model in which both galaxies
and density fluctuations trace the underlying dark matter power
spectrum. The three thin solid curves show the errors for the MWA
combined with a galaxy survey reaching m_min = 10^10, 10^11, and
2 × 10^11 M_⊙ (from bottom to top); the dashed curves show the
corresponding errors for our SKA. In each case, we assume 1000 
hr observations at z = 8. The vertical dotted line shows the fore- 
ground cut for this 21 cm survey (corresponding to B = 12 MHz); 
modes leftward of this line are removed from the 21 cm power 
spectrum during foreground cleaning. 



A second strategy is sparse sampling: to distribute the
galaxy observations across the entire 21 cm field, but
with a small filling factor. In this case, the V_surv^{-1/2} scaling
still approximately holds, although the window function
complicates the constraints. In general, the cost of sparse
sampling is that power can be aliased from high-k modes.
The optimal survey design would depend on the details
of the power spectrum (see, e.g., Heavens & Taylor 1997;
Kaiser 1998 for discussions of similar issues in galaxy
redshift surveys and weak lensing).

3.1. Spherically-Averaged Cross-Power Spectra

Figure 1 shows our estimate for the spherically-averaged
cross-correlation signal, in the context of the
simple model of §2.1.^5 In all cases we assume perfect
redshift information for the galaxy survey. In our
simple model, the signal is proportional to the linear
power spectrum, amplified by the galaxy bias and
redshift space distortions. The thin solid curves show our
error estimates for the MWA, while the dashed curves
show them for the SKA. Within each set, the curves
assume that the associated galaxy survey has m_min =
2 × 10^11, 10^11, and 10^10 M_⊙, from top to bottom. These
have ~ 2400, 24,000, and 8 million galaxies over the
survey volume, respectively.

^5 Note that the signal shown here assumes m_min = 10^10 M_⊙;
it is nearly proportional to the mean bias of the galaxies and so
actually depends on the particular galaxy survey (see §2.2).

Ignoring systematics, the MWA would provide a
measurement over the range 0.005 Mpc^-1 < k < 1 Mpc^-1
(in about ten independent bins). The SKA could reach
even smaller scales - although it is limited on larger scales
because our version actually has a somewhat smaller
field of view than the MWA. However, as with the 21
cm power spectrum, this range is misleading because of
foregrounds (McQuinn et al. 2005). The dotted curve
shows the wavenumber corresponding to k_∥,min. On
scales k < k_∥,min, one must take into account
discreteness in the Fourier transform - in particular, the only
modes with k < k_∥,min that are permitted in our thin
rectangular slice actually have k_∥ = 0 and so correspond
to modes along the plane of the sky. Such modes
cannot be separated from fluctuations in the astrophysical
foregrounds; thus, in reality, only modes rightward of
the dotted line are relevant. Even with the relatively
large bandwidth we have chosen here, this substantially
reduces the accessible range of scales.

Nevertheless, the cross-correlation is still a promising 
probe. Figure 2a shows the error for the MWA in
fractional terms; the thin solid, dashed, and dotted curves
are identical to the three MWA-based surveys shown in
Figure 1. Accuracy near one percent can be achieved
when the MWA is combined with a deep and wide galaxy 
survey. This is sufficiently precise that the galaxy sur- 
vey need not span the entire field of view. For example, 
reducing the areal coverage to ~ 1 square degree would 
still permit a marginal detection of the signal (especially 
if it is amplified by large ionized regions). This is al- 
ready achievable at slightly lower redshifts (see below) 
and would offer invaluable information about the high- 
redshift Universe, as well as a useful confirmation of the 
21 cm signal's cosmological origin. Moreover, even this 
relatively small size is perfectly adequate for recovering 
the full range of available k-modes. A one square degree
survey would have k_⊥,min ~ 0.04[10/(1 + z)]^0.2 Mpc^-1,
rather near the inevitable cutoff from foregrounds any- 
way. 
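
The smallest transverse wavenumber accessible to a galaxy field of a given angular size follows from the comoving distance to the survey. In the sketch below the distance of roughly 8900 Mpc to z = 8 is an assumed value computed separately for the cosmology quoted in the Introduction, and the function name is hypothetical.

```python
import math


def k_perp_min(dist_mpc, side_deg):
    """Smallest transverse wavenumber (Mpc^-1) for a field side_deg across."""
    side_mpc = dist_mpc * math.radians(side_deg)  # comoving width of the field
    return 2.0 * math.pi / side_mpc


# One-degree field at an assumed comoving distance of ~8900 Mpc (z = 8).
k_min_one_degree = k_perp_min(dist_mpc=8900.0, side_deg=1.0)
```

The result is close to 0.04 Mpc^-1, comparable to the line-of-sight cutoff imposed by foreground removal, which is why a one square degree field loses little of the usable range.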

If the 21 cm data is to be combined with a smaller 
galaxy survey, the extra field of view is essentially wasted. 
This can help to drive the design philosophy of 21 cm ar- 
rays: clearly for this purpose it is better to go deep over 
a small area rather than to simply add field of view. For 
example, an instrument like LOFAR is well-matched in
this regard. McQuinn et al. (2005) showed that the
sensitivities of the MWA and LOFAR to P_21 are nearly
identical, although the field of view of LOFAR is only ~ 50
square degrees (possibly split into several independent
beams). It makes up the difference in sampled volume
with its larger collecting area (about an order of
magnitude larger than the MWA), so that each field is
measured much more precisely. Thus, for a small field galaxy
survey, LOFAR would have errors smaller than those of
the MWA by a factor ~ (50/800)^{1/2} ~ 0.25. (LOFAR also
has longer baselines than the MWA, allowing it to probe
deeper in k space.)

The shapes of these error curves can be understood
through comparison to the thick curves, which show
the fractional errors on the associated measurement of
the 21 cm power spectrum (upper thick curve) and the
galaxy power spectrum (lower thick curve), again
assuming m_min = 10^10 M_⊙ and σ_z = 0. The large-scale errors
are nearly identical between the two; this is because the
uncertainty is dominated by cosmic variance at small k,
and the survey volumes are assumed to be identical. The
21 cm survey reaches peak sensitivity at k ~ 0.05 Mpc^-1,
near the foreground cutoff. At smaller scales, the
errors increase rapidly for two reasons. First, the effective
surface brightness sensitivity rapidly worsens, because
the baseline density decreases. Second, the longest
baseline of the MWA corresponds to k_⊥,max ≈ 0.6 Mpc^-1.
For k > k_⊥,max, only modes that are inclined relative
to the sky can be measured, so the effective sampling
decreases rapidly toward smaller scales (McQuinn et al.
2005; Bowman et al. 2005). On the other hand, galaxy
surveys (at least with σ_z = 0) have infinitely good
resolution in every direction, so they remain accurate to much
smaller scales (where shot noise takes over).

Fig. 2. — Fractional errors on power spectra for the MWA at (a) z = 8 and (b) z = 6. The upper thick solid curves are for 21 cm surveys
with our fiducial MWA parameters, while the lower thick solid curves are for galaxy surveys with m_min = 10^10 M_⊙. The thin curves are
for the cross-correlation between the two, with m_min = 10^10, 10^11, and 2 × 10^11 M_⊙ (solid, dashed, and dotted curves, respectively). The
vertical dotted line shows the foreground cut for this 21 cm survey (corresponding to B = 12 MHz).

From equation (4), we would naively expect the errors
on the cross-correlation to be approximately the
geometric mean of δP_21 and δP_gal. While this does appear to be
the case near and below the foreground cut, the noise on
smaller scales actually increases nearly as rapidly for the
cross-correlation as for the 21 cm power spectrum itself.
The reason is the mode sampling: beyond k_⊥,max, the 21
cm array can only measure a small fraction of the modes,
and of course only the measured modes can be
correlated with the galaxy information. Thus, while P_21,g can
be measured to scales several times smaller than the 21
cm measurement, the improvement is less than may have
been hoped and (regardless of the galaxy survey) there is
a fixed limit at small scales. This suggests that increasing
the maximum baseline length would offer quite
substantial improvements in the cross-correlation measurement
on small scales, even if n(k_⊥) is relatively sparse - as
indeed we see in the SKA curves in Figure 1.

Nevertheless, P_21,g has two advantages over
measurements of P_21. First, leveraging the galaxy information
increases the signal-to-noise by a factor of a few on scales
where cosmic variance can be ignored (at least for
first-generation surveys like the MWA; for SKA surveys, the
21 cm errors are already comparable to those in the
galaxy survey, so the signal-to-noise is similar).
Second, with a deep survey it extends the range of useful
k by a factor of several. This is important for two
reasons. First, Figure 2 shows that the dynamic range (in
k-space) of the 21 cm measurement is only about an
order of magnitude. Any features from reionization are
expected to be relatively broad, so identifying them
unambiguously will be relatively difficult (Furlanetto et al.
2004b). Extending the range by even a modest factor will
be useful in interpreting the data. Second, many of the
details of the interactions between the ionizing sources
and the IGM - such as recombinations - are hidden in
the small-scale power (Zahn et al. 2006).

Figure 2b shows the corresponding estimates for z = 6,
just below our current lower limits on the redshift of
reionization (and near the upper limits of existing galaxy
surveys). Here we assume T_sys = 250 K, offering
modest improvements to the 21 cm measurements. The
expected galaxy number density also increases (our surveys
now have ~ 7.2 × 10^4, 3.8 × 10^5, and 3 × 10^7 galaxies in
them). This presents the most optimistic case for
detecting the cross-correlation (by assuming that the 21
cm signal persists to the lowest possible redshift) and
can be compared with the estimates of Wyithe & Loeb
(2006). They use a model for the cross-correlation in the
presence of ionized regions to show that the "zero-point"
offset between the mean brightness temperature in pixels
with and without galaxies is measurable with the MWA
and the existing Subaru deep field of Lyα-selected
galaxies at z = 6.56 (Shimasaku et al. 2005; Kashikawa et al.
2006), spanning ~ 0.25 square degrees and Δz = 0.11
(corresponding to Δν ≈ 2.5 MHz).

Fig. 3. — Sensitivity to the cross-correlation when redshift errors
are included. In each panel, the solid black curve is for our fiducial
21 cm survey with the MWA at z = 8. The others show the
errors on the cross-correlation, with the solid, short-dashed,
long-dashed, and dotted curves assuming σ_z = 0, 0.001, 0.01, and 0.1,
respectively. We vary m_min between the two panels as shown. The
vertical dotted lines show the foreground cut for this 21 cm survey
(corresponding to B = 12 MHz).

We can use our results to estimate the
detectability of the cross-correlation with such a "minimal"
galaxy survey. After correcting for contamination from
lower redshift objects, the field contains ~ 36
galaxies (Kashikawa et al. 2006); we would therefore expect
~ 5 × 10^5 galaxies in the entire MWA field of view. To
within the limits of our simple model, the Subaru Deep
Field is therefore similar to our m_min = 10^11 M_⊙
estimates. Scaling our results by the ratio of the volumes,
we expect a fractional error ~ 1.3 in the best-measured
bins. Because HII regions will enhance the signal, we
therefore expect that even existing surveys can provide
a basic detection of the cross-correlation and offer
interesting complementary information to the 21 cm survey
itself, as in Wyithe & Loeb (2006), provided of course
that the IGM is still significantly neutral at z ~ 6.6.
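
One way to arrive at the ~ 5 × 10^5 galaxies quoted above is to scale the Subaru Deep Field count by sky area and by radial depth. The text states only the result, so the scaling below (constant number density, depth proportional to bandwidth, a 12 MHz band for the 21 cm survey) is our assumption, and the function name is hypothetical.

```python
def scale_galaxy_count(n_field, field_area_deg2, field_dnu_mhz,
                       target_area_deg2, target_dnu_mhz):
    """Scale a galaxy count to a larger volume at constant number density.

    The volume ratio is approximated by the ratio of sky areas times the
    ratio of bandwidths (radial depths).
    """
    area_ratio = target_area_deg2 / field_area_deg2
    depth_ratio = target_dnu_mhz / field_dnu_mhz
    return n_field * area_ratio * depth_ratio


# ~36 galaxies in ~0.25 deg^2 and ~2.5 MHz, scaled to 800 deg^2 and 12 MHz.
n_mwa_field = scale_galaxy_count(36, 0.25, 2.5, 800.0, 12.0)
```

Under these assumptions the volume ratio is about 1.5 × 10^4, which gives a count of a few times 10^5 galaxies, the same order as the m_min = 10^11 M_⊙ survey at z = 6.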

3.2. Redshift Errors 

To this point, we have correlated with spectroscopic 
surveys with infinitely good resolution. Figure 3 shows
the effect of imperfect redshift measurements, again
using our fiducial measurements at z = 8. The thick solid
lines show the errors in the 21 cm power spectrum
measurement and are identical between the two panels. The
other curves show errors on P_21,g for m_min = 10^10 M_⊙
(Fig. 3a) and m_min = 2 × 10^11 M_⊙ (Fig. 3b). The thin
solid, short-dashed, long-dashed, and dotted curves
assume σ_z = 0, 0.1%, 1%, and 10%, respectively.
Obviously these errors have no effect on the measurement
at large scales, because they only smear out small-scale 
power. Naively, one might expect their effects at small 
scales to be small as well: for the galaxy power spectrum 
itself, redshift errors have a relatively modest effect, be- 
cause transverse modes are entirely unaffected. 

However, Figure 3 shows that they have a more dramatic 
effect on cross-correlation measurements. Again, 
this is because of the unusual mode sampling in the 
21 cm observations. For wavenumbers larger than that 
corresponding to $\sigma_z$ (here $k_z \sim 0.3$ Mpc$^{-1}$, the inverse 
of the comoving line-of-sight uncertainty, for $\sigma_z = 0.01$), 
the galaxy survey begins to lose sensitivity 
to the line-of-sight modes. Unfortunately, these line-of-sight 
modes are precisely those which the 21 cm survey 
measures best: most obviously, a 21 cm telescope is confined 
to line-of-sight modes at wavenumbers above $k_{\perp,{\rm max}}$. Only 
when $k < k_{\perp,{\rm max}}$ and $k < k_z$ are measurements possible, 
because in this regime both surveys can use the pristine 
transverse modes to make measurements. With any realistic 
errors from photometric redshifts, $k_z < k_{\perp,{\rm max}}$ 
- even for a small telescope like the MWA - so errors 
on the small-scale power spectrum will blow up. Note 
that this cannot be helped with a deeper galaxy survey: 
the exponential cutoff at $k_z$ is much more severe than 
the slow increase in shot noise (see eq. [3]). As a result, 
redshift errors are "optimally" configured to destroy the 
cross-correlation measurement. Accurate spectroscopic 
redshifts will be necessary to take full advantage of the 
cross-correlation. 
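
As a rough numerical check of the cutoff scale discussed above, the sketch below converts an absolute redshift error into a comoving line-of-sight uncertainty, $c\,\sigma_z/H(z)$, and takes $k_z$ to be its inverse. The flat $\Lambda$CDM parameters ($\Omega_m = 0.3$, $h = 0.7$) and the function names are assumptions made for illustration, not necessarily the values adopted elsewhere in this paper.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def hubble(z, h=0.7, omega_m=0.3):
    """H(z) in km/s/Mpc for a flat LCDM model (assumed parameters)."""
    return 100.0 * h * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))

def k_z(sigma_z, z, h=0.7, omega_m=0.3):
    """Line-of-sight cutoff wavenumber (Mpc^-1) for an absolute redshift error."""
    sigma_r = C_KM_S * sigma_z / hubble(z, h, omega_m)  # comoving Mpc
    return 1.0 / sigma_r

for sigma_z in (0.001, 0.01, 0.1):
    print(f"sigma_z = {sigma_z:g}: k_z ~ {k_z(sigma_z, 8.0):.2f} Mpc^-1")
```

With these assumed parameters, $\sigma_z = 0.01$ at z = 8 gives a cutoff of roughly 0.3 Mpc$^{-1}$, and the cutoff moves to larger wavenumbers in proportion to $1/\sigma_z$.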

The ultimate limit on redshift accuracy is set by internal 
motions within the galaxies; if, for example, Lyα 
is used to measure the redshift, winds may displace the 
line in redshift space by several hundred km s$^{-1}$ (e.g., 
Shapley et al. 2003). Such winds would cause $\sigma_z \sim 10^{-3}$, 
shown by the short-dashed lines in Figure 3. Fortunately, 
these errors are small enough that they do not significantly 
degrade the cross-correlation measurement. 

3.3. The Anisotropic Power Spectrum 

According to equation (1), redshift-space distortions 
introduce anisotropy into these power spectra. In the 
previous sections we have ignored this anisotropy by 
averaging over the line-of-sight angle $\mu$. We did so because 
the sensitivity of first generation 21 cm experiments is 
heavily weighted along the frequency axis, so they span 
only a small domain in $\mu$ and will have difficulty extracting 
the anisotropic components (McQuinn et al. 2005). 
Cross-correlation with galaxy surveys can improve these 
constraints by adding sensitivity on small scales, although 
we still must contend with the hard limit from $k_{\perp,{\rm max}}$. 

To compute the resulting errors on the anisotropic 
power spectrum, we write 

    $P_{21,g}(k,\mu) = P_{\mu^0}(k) + \mu^2 P_{\mu^2}(k) + \mu^4 P_{\mu^4}(k)$    (7)

and use the parameter set $(P_{\mu^0}, P_{\mu^2}, P_{\mu^4})$ at each 
wavenumber k in the Fisher matrix analysis (following 
McQuinn et al. 2005). Figure 4 shows the resulting fractional 
errors. The solid black line is for the MWA 21 
cm survey alone (at z = 8). After foreground cleaning, 
it can measure the isotropic component to a precision of 
~ 10% over a limited range of scales, but the $\mu^2$ and $\mu^4$ 
components will only be weakly constrained (at best). 
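
To illustrate why a narrow range of $\mu$ constrains the angular components so weakly, the toy sketch below builds the 3 x 3 Fisher matrix for the decomposition in equation (7) from a set of sampled $\mu$ values and compares a survey with modes at all angles to one whose modes are bunched near the line of sight. It assumes equal, uncorrelated noise on every sample; the sampling and the function names are hypothetical, and this is not the full Fisher analysis used for Figure 4.

```python
import math

def fisher_errors(mu_values, sigma=1.0):
    """1-sigma errors on (P_mu0, P_mu2, P_mu4) from equally noisy samples of
    P(k, mu) = P_mu0 + mu^2 P_mu2 + mu^4 P_mu4 at fixed k."""
    basis = [[1.0, mu ** 2, mu ** 4] for mu in mu_values]
    # Fisher matrix: F_ij = sum over samples of (dP/dp_i)(dP/dp_j) / sigma^2
    f = [[sum(b[i] * b[j] for b in basis) / sigma ** 2 for j in range(3)]
         for i in range(3)]
    det = (f[0][0] * (f[1][1] * f[2][2] - f[1][2] * f[2][1])
           - f[0][1] * (f[1][0] * f[2][2] - f[1][2] * f[2][0])
           + f[0][2] * (f[1][0] * f[2][1] - f[1][1] * f[2][0]))
    # Diagonal of the inverse matrix = diagonal cofactors / determinant
    cofactors = [f[1][1] * f[2][2] - f[1][2] * f[2][1],
                 f[0][0] * f[2][2] - f[0][2] * f[2][0],
                 f[0][0] * f[1][1] - f[0][1] * f[1][0]]
    return [math.sqrt(c / det) for c in cofactors]

def linspace(a, b, n):
    return [a + (b - a) * i / (n - 1) for i in range(n)]

wide = fisher_errors(linspace(0.0, 1.0, 50))    # modes at all angles
narrow = fisher_errors(linspace(0.8, 1.0, 50))  # modes near the line of sight
print("all angles :", [round(e, 2) for e in wide])
print("narrow mu  :", [round(e, 2) for e in narrow])
```

With the same number of samples and the same noise, the narrow-$\mu$ case returns much larger errors on the $\mu^2$ and $\mu^4$ terms, because the three basis functions become nearly degenerate over a small interval in $\mu$.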

The other curves show the analogous errors on the 
cross-correlation. The long-dashed curve assumes a 
galaxy survey with $m_{\rm min} = 10^{10} M_\odot$ and perfect redshift 
information. All of the constraints are markedly 
better than for $P_{21}$ itself, extending the usable range 
to smaller scales (by a factor of a few) and also increasing 
the peak precision by at least a factor of two. For 
the MWA, $k_{\perp,{\rm max}} \approx 0.6$ Mpc$^{-1}$: as expected, our sensitivity 
to the anisotropic components declines rapidly 
past this point. The galaxy survey helps to pull out the 
$\mu$-dependence in the presence of noise, but of course it 
cannot reach beyond the angular resolution of the 21 cm 
array, where no modes with small $\mu$ are available at all. 

Fig. 4. - Fractional errors on the three $\mu$-dependent components 
of the power spectrum. In each panel, the solid black curve 
is for our fiducial 21 cm survey with the MWA at z = 8. The 
others show the errors on the cross-correlation. The long-dashed 
lines take $m_{\rm min} = 10^{10} M_\odot$ and $\sigma_z = 0$, the short-dashed lines 
take $m_{\rm min} = 10^{11} M_\odot$ and $\sigma_z = 0$, and the dot-dashed lines take 
$m_{\rm min} = 10^{10} M_\odot$ and $\sigma_z = 0.1$. The dotted curves show the errors 
using the SKA, assuming $m_{\rm min} = 10^{10} M_\odot$ and $\sigma_z = 0$. The 
vertical dotted lines show the foreground cut for this 21 cm survey 
(corresponding to B = 12 MHz). 



The short-dashed curves in Figure 4 take a shallower 
survey with $m_{\rm min} = 10^{11} M_\odot$ and $\sigma_z = 0$, while the 
dot-dashed curves take $m_{\rm min} = 10^{10} M_\odot$ and $\sigma_z = 0.1$. 
The isotropic component is relatively unaffected in both 
cases, but redshift errors at this level do significantly 
degrade the ability to constrain the $\mu^2$ and $\mu^4$ terms. Note, 
however, that 1% redshift errors have almost no effect on 
the measurement, because they distort the spectrum only 
at $k > k_{\perp,{\rm max}}$ anyway. 

Finally, the dotted curve shows the errors on the three 
components for a cross-correlation between a galaxy survey 
(with $m_{\rm min} = 10^{10} M_\odot$ and $\sigma_z = 0$) and an SKA 
field. The constraints are worse at small k (because the 
sampled volume is smaller) but show a dramatic improvement 
at larger k. The SKA has significantly larger baselines 
and retains sensitivity all the way to k ~ 3 Mpc$^{-1}$ 
in all the angular components. In this case, the errors on 
$P_{21,g}$ and $P_{21}$ are actually comparable, so the improvements 
offered by the cross-correlation are much less significant 
from a data analysis perspective - although of 
course their astrophysical content is still complementary. 

4. FOREGROUND CONTAMINATION 

To this point we have focused on the improved sensitiv- 
ity offered by the cross-correlation. It has the additional 
advantage of easing the requirements for foreground re- 
moval, because the only foregrounds that will survive the 
cross-correlation are those arising from the cosmological 
volume of the galaxy survey: free-free and synchrotron 
emission from the high-redshift galaxies. Here we will 
estimate how strong this contamination is and show that 
even the simplest foreground removal scheme should ad- 
equately remove the residuals. 

We will begin by calculating the free-free contamination 
following the method of Oh & Mack (2003). The 
free-free emissivity can be well-fit by 

    $\epsilon_\nu = \epsilon_0\, n_e^2\, T_4^{-0.35}$    (8)

where $\epsilon_0 = 3.2 \times 10^{-39}$ erg cm$^3$ s$^{-1}$ Hz$^{-1}$, $n_e$ is the electron 
density, and $T_4$ is the electron temperature in units of $10^4$ K. 
The total free-free luminosity of a single galaxy is the volume integral 
over all its HII regions, $\propto \int n_e^2\, dV$. But of course the 
total recombination rate in the galaxy is also $\propto \int n_e^2\, dV$ 
(at least ignoring heavy elements). This allows us to 
relate the free-free luminosity $L_{ff}$ to the total production 
rate of ionizing photons if we assume that recombinations 
are in equilibrium with ionizations: 
(9) 



tions and ionizations in this galaxy, $\alpha_B$ is the case-B 
recombination coefficient, and $f_{\rm esc}$ is the escape fraction 
of ionizing photons. 
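
The equilibrium argument can be sketched numerically. The code below assumes that recombinations inside the galaxy balance the ionizing photons that do not escape, $(1 - f_{\rm esc})\,\dot{N}_{\rm ion} = \alpha_B \int n_e^2\, dV$, so that $L_{ff} = \epsilon_0 T_4^{-0.35} (1 - f_{\rm esc})\,\dot{N}_{\rm ion} / \alpha_B$. This explicit form, the value $\alpha_B \approx 2.6 \times 10^{-13}$ cm$^3$ s$^{-1}$ at $10^4$ K (with its temperature dependence ignored), the symbol $\dot{N}_{\rm ion}$, and the example photon rate are all assumptions of the sketch.

```python
EPS0 = 3.2e-39     # erg cm^3 s^-1 Hz^-1, normalization of the free-free emissivity
ALPHA_B = 2.6e-13  # cm^3 s^-1, case-B recombination coefficient at 1e4 K (assumed)

def free_free_luminosity(n_dot_ion, f_esc=0.1, t4=1.0):
    """Specific free-free luminosity (erg s^-1 Hz^-1) of a galaxy producing
    n_dot_ion ionizing photons per second, assuming ionization equilibrium."""
    # Photons that do not escape are balanced by recombinations:
    # (1 - f_esc) * n_dot_ion = ALPHA_B * integral(n_e^2 dV)
    emission_measure = (1.0 - f_esc) * n_dot_ion / ALPHA_B  # cm^-3
    return EPS0 * t4 ** -0.35 * emission_measure

# Hypothetical galaxy producing 1e53 ionizing photons per second
print(f"{free_free_luminosity(1e53):.2e} erg/s/Hz")
```

The luminosity is linear in the ionizing photon rate, which is what allows the mean free-free brightness to be tied directly to the total ionizing emissivity.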

Thus the mean brightness temperature of free-free 
emission from our survey volume is simply related to the 
total ionizing rate. We assume that the emissivity of ion- 
izing photons is proportional to the total rate at which 
gas collapses onto galaxies, 



    $\epsilon_{\rm ion} = f_\star N_{\gamma b}\, \bar{\rho}_b\, (1+z) H(z) \left| \frac{d f_{\rm coll}}{dz} \right|$    (10)

where $\bar{\rho}_b$ is the mean comoving density of baryons, $f_{\rm coll}$ is 
the fraction of matter in galaxy-sized halos, $f_\star$ is the star 
formation efficiency, and $N_{\gamma b}$ is the number of ionizing 
photons produced per stellar baryon. Thus, the mean 
brightness temperature is (suppressing the temperature 
dependence of $\epsilon_\nu$ and $\alpha_B$) 
where $T$ is the mean brightness temperature if all the 
baryons were inside galaxies. (Note that $\delta T_{ff}$ is actually 
independent of $\Omega_{\rm surv}$, although it does depend on the radial 
depth.) For reference, at z = 8, assuming $f_\star = 0.1$, 
$f_{\rm esc} = 0.1$, and $N_{\gamma b} = 4000$ (as appropriate for Population 
II stars), we find $\delta T_{ff} = 4.5$ mK and $T = 161$ mK. 
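
For orientation, the ionizing emissivity of equation (10) can be evaluated with a short sketch. It assumes the form $\epsilon_{\rm ion} = f_\star N_{\gamma b}\, n_b\, (1+z) H(z)\, |df_{\rm coll}/dz|$, with $n_b$ the comoving baryon number density, a flat $\Lambda$CDM expansion rate ($\Omega_m = 0.3$, $h = 0.7$), and $n_b \approx 2.5 \times 10^{-7}$ cm$^{-3}$. The value of $df_{\rm coll}/dz$ in the example is purely hypothetical; in practice it would come from a halo mass function.

```python
import math

KM_PER_MPC = 3.0857e19

def hubble_per_second(z, h=0.7, omega_m=0.3):
    """H(z) in s^-1 for a flat LCDM model (assumed parameters)."""
    h_km_s_mpc = 100.0 * h * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)
    return h_km_s_mpc / KM_PER_MPC

def ionizing_emissivity(z, dfcoll_dz, f_star=0.1, n_gamma_b=4000.0, n_b=2.5e-7):
    """Ionizing photons per second per comoving cm^3.

    Uses df_coll/dt = |df_coll/dz| (1 + z) H(z); n_b is the comoving baryon
    number density in cm^-3 (assumed value)."""
    dfcoll_dt = abs(dfcoll_dz) * (1.0 + z) * hubble_per_second(z)
    return f_star * n_gamma_b * n_b * dfcoll_dt

# Hypothetical collapse rate df_coll/dz = -0.02 at z = 8
print(f"{ionizing_emissivity(8.0, -0.02):.2e} photons/s/cm^3 (comoving)")
```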

Because the free-free spectrum is featureless, it contains 
no radial information and is most easily described 
in terms of the angular power spectrum. In our notation, 
the two components (clustering and Poisson) are 
then (e.g., Peebles 1980) 

    $C_l^c = T^2 f_{\rm coll}^2\, w_l$,    (12)
    $C_l^p = \frac{T^2\, \Omega_{\rm surv}}{V_{\rm surv}} \int dm \left( \frac{m}{\bar{\rho}} \right)^2 n(m)$.    (13)



Here n(m) is the halo mass function, the integration extends 
over all galaxies, and $w_l$ is the appropriate coefficient 
in the Legendre expansion of the angular correlation 
function. We will take $w(\theta) = (\theta/\theta_0)^{-0.8}$ (Oh & Mack 
2003), where $\theta_0 = 2'$ is the correlation length. Then 

We will also define $l_0 = 1/\theta_0 = 1719$. In equation (13), 
we have assumed that the free-free luminosity (and hence 
the ionizing luminosity) of each galaxy is proportional to 
its mass. Note that we have not excised any of the bright 
point sources here; doing so would effectively impose a 
finite limit on the integrals over halo mass. Evaluating 
these for our fiducial survey parameters, we find 
The free-free contamination is straightforward to estimate 
and provides a minimal level of contamination, 
but it is likely to be much smaller than the cumulative 
synchrotron emission from the same galaxies (produced 
by fast electrons in their interstellar media). Unfortunately, 
this component is also much more difficult to 
estimate robustly, because it depends on the details of 
magnetic field generation and cosmic ray acceleration 
inside of galaxies. For a simple estimate, we assume that 
the observed correlation between synchrotron luminosity 
and star formation rate (SFR) in nearby galaxies applies 
equally well at high redshifts (Yun et al. 2001): 

    $L_{1.4\,{\rm GHz}} = 1.7 \times 10^{28} \left( \frac{{\rm SFR}}{M_\odot\ {\rm yr}^{-1}} \right)$ erg s$^{-1}$ Hz$^{-1}$,    (17)



where $L_{1.4\,{\rm GHz}}$ is the specific synchrotron luminosity at 
a (rest) frequency of 1.4 GHz (fortunately, just the frequency 
we need for the contamination from the survey 
volume). The origin of this relation is unclear; one 
possibility is that the magnetic fields in starbursts are 
well above their minimal-energy value (Thompson et al. 
2006). Fortunately, for our simple estimate, its origin is 
not so important, and we will just use the empirical relation. 
In any case, by assuming that the star formation 
rate tracks the rate at which gas accretes onto galaxies, 
equation (17) allows us to associate the synchrotron volume 
emissivity to $f_{\rm coll}$ just as with free-free emission; we 
find 
    $\frac{\delta T_{\rm syn}}{\delta T_{ff}} \sim \frac{\ldots}{1 - f_{\rm esc}}$    (18)



for our fiducial parameters. Thus, at least in the simplest 
models, the synchrotron foreground will be about an or- 
der of magnitude larger than the free-free foreground. 
Because both originate from the same source popula- 
tion, and because each is proportional to the star forma- 
tion rate, their fluctuation spectra will also have identical 
shapes. 
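
The order-of-magnitude ratio between the two foregrounds can be checked per unit star formation rate. The sketch below takes the empirical synchrotron normalization of $1.7 \times 10^{28}$ erg s$^{-1}$ Hz$^{-1}$ per $M_\odot$ yr$^{-1}$ and compares it with the free-free luminosity implied by $N_{\gamma b} = 4000$ ionizing photons per stellar baryon. The free-free expression, $L_{ff} = \epsilon_0 T_4^{-0.35} (1 - f_{\rm esc})\,\dot{N}_{\rm ion}/\alpha_B$ with $\alpha_B \approx 2.6 \times 10^{-13}$ cm$^3$ s$^{-1}$, is an assumption of the sketch, as is the neglect of any spectral difference between the two components at the frequency of interest.

```python
M_SUN_G = 1.989e33   # solar mass in g
M_P_G = 1.6726e-24   # proton mass in g
YEAR_S = 3.156e7     # seconds per year
EPS0 = 3.2e-39       # erg cm^3 s^-1 Hz^-1
ALPHA_B = 2.6e-13    # cm^3 s^-1 at 1e4 K (assumed)

def synchrotron_luminosity(sfr):
    """Empirical 1.4 GHz luminosity (erg s^-1 Hz^-1) for an SFR in Msun/yr."""
    return 1.7e28 * sfr

def free_free_luminosity(sfr, n_gamma_b=4000.0, f_esc=0.1, t4=1.0):
    """Free-free luminosity (erg s^-1 Hz^-1) for the same SFR, assuming
    ionization equilibrium inside the galaxy."""
    baryons_per_s = sfr * M_SUN_G / M_P_G / YEAR_S
    n_dot_ion = n_gamma_b * baryons_per_s
    return EPS0 * t4 ** -0.35 * (1.0 - f_esc) * n_dot_ion / ALPHA_B

ratio = synchrotron_luminosity(1.0) / free_free_luminosity(1.0)
print(f"L_syn / L_ff ~ {ratio:.1f}")
```

Because both luminosities are linear in the star formation rate, the ratio is independent of SFR and depends only on $N_{\gamma b}$, $f_{\rm esc}$, and the electron temperature.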

The fluctuation amplitude expected from the 21 cm 
line is a few mK; thus equation (15) shows that the 
residual contamination in the angular power spectrum 
is at least comparable to the 21 cm signal, and probably 
several times larger thanks to the synchrotron component. 
If we only had information at one frequency, there 
would be no way to separate these foregrounds and recover 
the cosmological signal.$^6$ But, with multichannel 

6 Actually, because the foreground contamination at a single fre- 
quency is proportional to the radial width of the survey, while the 
21 cm fluctuation power at that frequency is independent of the 
depth, the foreground contamination can be minimized by corre- 
lating with narrow redshift slices in the galaxy survey. However, 
such a method would eliminate modes along the line of sight and 
so is not as useful. 



measurements, we can take advantage of the fact that 
both synchrotron and free-free emission have smooth, 
nearly power-law spectra, while the 21 cm background 
varies rapidly. Conceptually, we can therefore choose 
one frequency slice and use it to calibrate the foreground 
contamination for all frequencies along each direction 
(Zaldarriaga et al. 2004; Morales & Hewitt 2004). The 
remaining fluctuation power will then be the sum of the 
21 cm signal, errors in beam calibration, and deviations 
from the fit value to the foregrounds. These deviations 
are caused by variations in the spectral indices of sources 
in different pixels; if all the pixels had the same spectral 
index, this simple scheme would be perfect. 
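
The calibration scheme described above can be illustrated with a toy calculation. Each pixel is given a pure power-law foreground; the brightness in a calibration slice is scaled to a second frequency with a single mean spectral index, and the rms residual is reported. The mean index of 2.7, the 40 mK amplitude, the frequencies, and the function name are hypothetical choices, and the 21 cm signal itself and beam errors are left out.

```python
import random

def residual_rms(amplitudes_mk, indices, nu_mhz, nu0_mhz=150.0):
    """rms residual (mK) after scaling each pixel's calibration-slice
    brightness to nu_mhz with one mean spectral index for all pixels."""
    mean_index = sum(indices) / len(indices)
    ratio = nu_mhz / nu0_mhz
    residuals = [a * (ratio ** -z - ratio ** -mean_index)
                 for a, z in zip(amplitudes_mk, indices)]
    return (sum(r * r for r in residuals) / len(residuals)) ** 0.5

random.seed(1)
n_pix = 1000
amps = [40.0] * n_pix                                   # mK at 150 MHz
same = [2.7] * n_pix                                    # identical indices
scattered = [random.gauss(2.7, 0.2) for _ in range(n_pix)]

print("identical indices:", residual_rms(amps, same, 156.0), "mK")
print("scatter of 0.2   :", residual_rms(amps, scattered, 156.0), "mK")
```

With identical indices the residual vanishes to rounding error, while a scatter of 0.2 leaves a residual of a few tenths of a mK for a 6 MHz separation, roughly the foreground amplitude times the scatter times the fractional frequency offset.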

In reality, the fit is done in Fourier space rather than 
real space, so we subtract a constant value from each 
"pixel." In the process, the $k_\parallel = 0$ modes are lost 
(they correspond to modes along the plane of the sky), 
but those with $k_\parallel > 0$ suffer only minor degradation.$^7$ 
We can roughly quantify this residual power by estimating 
the degree of correlation for the foregrounds in two 
nearby frequency channels $\nu_1$ and $\nu_2$ (Zaldarriaga et al. 
2004): 


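    $I_l(\nu_1, \nu_2) \equiv \frac{C_l(\nu_1, \nu_2)}{\sqrt{C_l(\nu_1, \nu_1)\, C_l(\nu_2, \nu_2)}} \approx 1 - \frac{(\delta\zeta)^2}{2} \ln^2(\nu_1/\nu_2)$,    (19)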

where in the simplest models (in which Poisson fluctuations 
dominate) $\delta\zeta$ is the scatter in the spectral indices 
of sources in the beam. For free-free sources, variations 
originate from the electron temperature; however, 
even allowing electron temperatures across the entire 
range $10^4$-$10^5$ K, $\delta\zeta \sim 0.03$ (Santos et al. 2005). The 
dispersion in synchrotron spectra may be much larger, 
$\delta\zeta \sim 0.2$ (Cohen et al. 2004), although these measurements 
spanned 74 MHz-1.4 GHz. We are only concerned 
with the variations over a much smaller interval spanning 
~ 100 MHz (in the rest frequency). Some of the dispersion 
over the observed frequency range is likely in the 
locations or magnitudes of spectral breaks well outside 
the band of interest, so we regard the Cohen et al. (2004) 
measurement as an upper limit. Moreover, if each pixel 
contains many unresolved sources, we would expect the 
net $\delta\zeta$ to be even smaller. Thus the typical error in the 
21 cm power spectrum resulting from our spectral fits is 
(Zaldarriaga et al. 2004) 



    $\sim 0.4 \left( \frac{\delta T_{fg}}{40\ {\rm mK}}\, \frac{\delta\zeta}{0.2}\, \frac{\Delta\nu}{6\ {\rm MHz}}\, \frac{150\ {\rm MHz}}{\nu} \right)^2$    (20)

where $\delta T_{fg}$ is the mean brightness of all the foregrounds 
that survive the cross-correlation. We see that the errors 
from foregrounds within the survey volume remain 
well below the signal for any reasonable frequency separation, 
so even the simplest foreground removal algorithm 
- subtracting a constant from each $k_\perp$ pixel - 
should suffice for measuring the 21 cm-galaxy cross-correlation. 
This should be contrasted with the 21 cm 
measurement on its own, which requires more sophisticated 
algorithms because it must contend with contamination 
from sources at all redshifts (e.g., Santos et al. 
2005; Wang et al. 2005; McQuinn et al. 2005). 

7 To this point we have ignored discreteness in the Fourier 
decomposition. In reality, because the survey volume is a narrow 
slice, the only modes that can exist with $k_\parallel$ smaller than the 
fundamental mode set by the radial width of the survey have 
$k_\parallel = 0$. This is why, in Figs. 1-4, we 
argued that all modes with $k < k_{\parallel,{\rm min}}$ will be lost in the cleaning. 
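
The size of the residual can also be estimated analytically. The sketch below assumes Poisson-dominated sources whose spectral indices are Gaussian-distributed with scatter $\delta\zeta$, for which the correlation between two channels is $\exp[-(\delta\zeta)^2 \ln^2(\nu_1/\nu_2)/2]$, and it takes the rms left after differencing two equally bright channels to be $\delta T_{fg} \sqrt{2(1 - I)}$. Both expressions and the fiducial inputs (40 mK, channels at 150 and 156 MHz) are assumptions of the sketch rather than the fitting procedure itself.

```python
import math

def channel_correlation(nu1, nu2, delta_zeta):
    """Foreground correlation between two channels for Gaussian scatter
    delta_zeta in the source spectral indices (assumed model)."""
    return math.exp(-0.5 * (delta_zeta * math.log(nu1 / nu2)) ** 2)

def differencing_residual(delta_t_fg_mk, nu1, nu2, delta_zeta):
    """rms residual (mK) after differencing two channels of equal variance:
    sqrt(2 * (1 - I)) times the foreground brightness."""
    corr = channel_correlation(nu1, nu2, delta_zeta)
    return delta_t_fg_mk * math.sqrt(2.0 * (1.0 - corr))

for label, dz in (("free-free  ", 0.03), ("synchrotron", 0.2)):
    res = differencing_residual(40.0, 150.0, 156.0, dz)
    print(f"{label}: delta_zeta = {dz:4.2f} -> residual ~ {res:.3f} mK")
```

For these inputs the synchrotron case leaves a few tenths of a mK, well below a 21 cm fluctuation amplitude of a few mK, and the free-free case is smaller still.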

5. DISCUSSION 

We have studied the potential for measuring the cross- 
correlation between the redshifted 21 cm signal from the 
high-redshift IGM (during and before reionization) and 
the galaxy population. We have emphasized four advan- 
tages to this measurement that will complement (from a 
data analysis perspective) observations of fluctuations in 
the 21 cm signal. First, at least for the initial generation 
of experiments, the combination with a galaxy survey in- 
creases the expected signal-to-noise by a factor of a few. 
This allows measurements to extend to somewhat smaller 
scales and significantly increases the dynamic range (in k- 
space) of 21 cm observatories. Second, the galaxy survey 
modes have higher sensitivity along the plane of the sky 
and so improve measurements of the anisotropy of the 21 
cm signal. This will be useful in extracting fundamental 
cosmological parameters, because the redshift space 
distortions are sourced directly by the dark matter distribution 
and are much more robust to astrophysical uncertainties 
(Barkana & Loeb 2005a; McQuinn et al. 2005). 

Third, by cross-correlating with galaxies (which we 
can unambiguously determine to be at the proper redshift), 
this measurement confirms the reality of the cosmological 
21 cm signal. Given the enormous brightness 
of the Galactic and extragalactic foregrounds (easily 
exceeding ~ 100-1000 K, compared to the ~ 10 mK 
signal), and the complexity of the data analysis (see, 
e.g., Furlanetto et al. 2006b, §9), this conceptual advantage 
should not be underestimated. Finally, measuring 
$P_{21,g}$ drastically reduces the requirements for foreground 
cleaning, because only those foregrounds originating 
from the survey volume (such as free-free and synchrotron 
emission from the galaxies; Oh & Mack 2003) 
will survive the cross-correlation. We have shown that, 
although modes in the plane of the sky will remain contaminated 
(albeit only at an order unity level), residual 
contamination of modes with a line-of-sight component 
will be small, even without any sophisticated cleaning 
algorithm. 

Although the ideal experiment would use a galaxy survey 
spanning the entire volume of the 21 cm observation, 
the signal-to-noise requirements are lax enough that even 
a much smaller galaxy survey can be interesting. In fact, 
because 21 cm surveys are essentially thin redshift slices, 
modes wider than the redshift depth of the survey are lost 
anyway during foreground removal. Thus, provided that 
the angular size of the galaxy survey exceeds the redshift 
depth, there is no disadvantage (other than a loss of statistical 
power) to using small fields - the same range of k 
modes is still available. A survey of a few square degrees 
- already achievable at z ~ 6-7 with present technology - 
would satisfy this requirement. For example, as shown by 
Wyithe & Loeb (2006), the existing Subaru Deep Field 
observations of z = 6.56 Lyα emitters (Kashikawa et al. 
2006), in combination with an MWA field, could offer 
interesting constraints. 
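
The field-size requirement can be made concrete with a short calculation of the comoving depth of a 21 cm band and the angle that the same length subtends on the sky. The sketch assumes a flat $\Lambda$CDM model with $\Omega_m = 0.3$ and $h = 0.7$ and uses a 12 MHz band at z = 8 as the example; the function names and the integration scheme are illustrative choices.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s
NU_21_MHZ = 1420.40575  # rest frequency of the 21 cm line in MHz

def e_of_z(z, omega_m=0.3):
    """Dimensionless expansion rate H(z)/H0 for flat LCDM (assumed)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def comoving_distance(z, h=0.7, omega_m=0.3, steps=2000):
    """Comoving distance to redshift z in Mpc (midpoint-rule integration)."""
    dz = z / steps
    integral = sum(dz / e_of_z((i + 0.5) * dz, omega_m) for i in range(steps))
    return C_KM_S / (100.0 * h) * integral

def radial_depth(z, bandwidth_mhz, h=0.7, omega_m=0.3):
    """Comoving depth in Mpc spanned by a bandwidth centered at redshift z."""
    hubble = 100.0 * h * e_of_z(z, omega_m)  # km/s/Mpc
    return C_KM_S * (1.0 + z) ** 2 * bandwidth_mhz / (NU_21_MHZ * hubble)

depth = radial_depth(8.0, 12.0)
angle_deg = math.degrees(depth / comoving_distance(8.0))
print(f"depth ~ {depth:.0f} Mpc, matching angular size ~ {angle_deg:.2f} deg")
```

Under these assumptions the band is about 200 comoving Mpc deep, which corresponds to an angle of order a degree, so a galaxy field of a few square degrees is indeed wider than the slice is deep.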

On the other hand, we have also found that, to take 
full advantage of the cross-correlation, spectroscopic red- 
shifts are probably required. This is because 21 cm arrays 
are primarily sensitive to line-of-sight modes, precisely 



those that redshift errors contaminate. Thus the techni- 
cal requirements for our surveys are substantially greater 
than, e.g., weak lensing, which only requires reasonably 
good photometric redshifts (so long as the distributions 
are understood extremely well). 

Measuring the cross-correlation also drives the design 
of low-frequency telescopes in specific directions. In par- 
ticular, statistical measurements like the power spectrum 
have a tradeoff between survey area and depth in any 
particular field. Either way is a viable method - for ex- 
ample, the MWA (which has a large field of view) and 
LOFAR (which has much more collecting area, but a 
smaller field of view) offer comparable constraints on the 
21 cm power spectrum (McQuinn et al. 2005). However, 
given the difficulty of deep, wide galaxy surveys, we are 
unlikely to be able to cover hundreds of square degrees in 
the near future. The extra field of view of the radio tele- 
scope is then useless in regard to the cross-correlation: 
for these purposes, depth is more important than field 
of view. Another implication is that long baselines, even 
with a low filling factor, can be quite useful. Because 
the galaxy power spectrum can extend to much smaller 
scales than the 21 cm survey's angular coverage, it can 
pull out extra information on the small-scale behavior - 
but only if at least some baselines exist at the relevant 
scales. 

In this paper, we have focused on the detectability of 
this cross-correlation rather than the physical insight to 
be gleaned from it. Of course, the cross-correlation has 
obvious scientific uses, especially with regard to the relationship 
between the ionizing sources and the 21 cm 
line. The most basic observable is whether over- or underdense 
gas is ionized first ("inside-out" versus "outside-in" 
reionization), which should be measurable with the 
first surveys (Wyithe & Loeb 2006). However, there is 
of course much more to be gained: for example, isolating 
those galaxies responsible for the bulk of the ionizing 
photons and testing the role of recombinations. Another 
interesting application is the cross-correlation with 
different galaxy samples; for example, Gunn & Peterson 
(1965) absorption is expected to affect the Lyα lines 
of galaxies during reionization (Miralda-Escudé & Rees 
1998; Haiman 2002). But the amount of absorption suffered 
by each galaxy depends on how near it is to ionized 
gas - so Lyα-selected galaxies can indirectly map 
the distribution of HI and ionized bubbles in the IGM 
(Furlanetto et al. 2004a, 2006c). Cross-correlation of 
such a population with the 21 cm emission would help 
to isolate the HII region structure. Given the complexity 
of the 21 cm signal (e.g., Furlanetto et al. 2004b; 
Zahn et al. 2006), these aspects are best explored with 
numerical simulations, which we defer to the future. 

The cross-correlation will also, of course, be useful 
even before reionization gets underway. In earlier 
phases, the galaxies seed fluctuations in the spin 
temperature (through Lyα coupling; Barkana & Loeb 
2005b) and the gas temperature (through X-rays; 
Pritchard & Furlanetto 2006), both of which affect the 
21 cm brightness temperature. Thus, eventually we hope 
to push both galaxy surveys and the 21 cm observations 
to much higher redshifts: this will no doubt be difficult, 
given the rapid decline in the galaxy number density and 
the rapid increase in the 21 cm foregrounds. But understanding 
the detailed properties of these earlier generations 
of galaxies would be an enormous payoff for the 
effort. 